
RAW SEQUENCE LISTING 

PATENT APPLICATION: US/10/045,180A



DATE: 06/03/2002 
TIME: 12:40:19 



Input Set : A:\EP.txt
Output Set: N:\CRF3\06032002\J045180A.raw

<110> APPLICANT: Bougueleret, Lydie
      Chumakov, Ilya
<120> TITLE OF INVENTION: Human Defensin Polypeptide Def-X, Genomic DNA and cDNA, Composition
      Containing Them and Applications to Diagnosis and to Therapeutic Treatment
<130> FILE REFERENCE: GEN-100D1
<140> CURRENT APPLICATION NUMBER: US 10/045,180A
<141> CURRENT FILING DATE: 2001-10-18
<150> PRIOR APPLICATION NUMBER: US 09/486,580
<151> PRIOR FILING DATE: 2000-02-25
<150> PRIOR APPLICATION NUMBER: PCT/FR98/01864
<151> PRIOR FILING DATE: 1998-08-28
<150> PRIOR APPLICATION NUMBER: FR 97/10823
<151> PRIOR FILING DATE: 1997-08-29
<160> NUMBER OF SEQ ID NOS: 12
<170> SOFTWARE: PatentIn version 3.1

<210> SEQ ID NO: 1
<211> LENGTH: 4415
<212> TYPE: DNA
<213> ORGANISM: Homo sapiens

<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1)..(4415)
<223> OTHER INFORMATION: Def-X genomic sequence

<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (85)..(85)
<223> OTHER INFORMATION: n = a, c, g, or t.

<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (143)..(143)
<223> OTHER INFORMATION: n = a, c, g, or t.

<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (670)..(670)
<223> OTHER INFORMATION: n = a, c, g, or t.

<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (970)..(970)
<223> OTHER INFORMATION: n = a, c, g, or t.

<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (1111)..(1111)
<223> OTHER INFORMATION: n = a, c, g, or t.
68 <220> FEATURE:
69 <221> NAME/KEY: misc_feature
70 <222> LOCATION: (1150)..(1150)
71 <223> OTHER INFORMATION: n = a, c, g, or t.

74 <220> FEATURE:
75 <221> NAME/KEY: CAAT_signal
76 <222> LOCATION: (1711)..(1714)
77 <223> OTHER INFORMATION:

80 <220> FEATURE:
81 <221> NAME/KEY: TATA_signal
82 <222> LOCATION: (1758)..(1767)
83 <223> OTHER INFORMATION:

86 <220> FEATURE:
87 <221> NAME/KEY: misc_feature
88 <222> LOCATION: (1780)..(1780)
89 <223> OTHER INFORMATION: n = a, c, g, or t.

92 <220> FEATURE:
93 <221> NAME/KEY: misc_feature
94 <222> LOCATION: (1836)..(1874)
95 <223> OTHER INFORMATION: Exon 1

98 <220> FEATURE:
99 <221> NAME/KEY: misc_feature
100 <222> LOCATION: (1875)..(1880)
101 <223> OTHER INFORMATION: splice donor site

104 <220> FEATURE:
105 <221> NAME/KEY: misc_feature
106 <222> LOCATION: (1974)..(1974)
107 <223> OTHER INFORMATION: n = a, c, g, or t.

110 <220> FEATURE:
111 <221> NAME/KEY: misc_feature
112 <222> LOCATION: (2117)..(2117)
113 <223> OTHER INFORMATION: n = a, c, g, or t.

116 <220> FEATURE:
117 <221> NAME/KEY: misc_feature
118 <222> LOCATION: (2133)..(2133)
119 <223> OTHER INFORMATION: n = a, c, g, or t.

122 <220> FEATURE:
123 <221> NAME/KEY: misc_feature
124 <222> LOCATION: (2155)..(2335)
125 <223> OTHER INFORMATION: Alu insertion

128 <220> FEATURE:
129 <221> NAME/KEY: misc_feature
130 <222> LOCATION: (2186)..(2186)
131 <223> OTHER INFORMATION: n = a, c, g, or t.

134 <220> FEATURE:
135 <221> NAME/KEY: misc_feature
136 <222> LOCATION: (2191)..(2191)
137 <223> OTHER INFORMATION: n = a, c, g, or t.

140 <220> FEATURE:
141 <221> NAME/KEY: misc_feature
142 <222> LOCATION: (2367)..(2367)
143 <223> OTHER INFORMATION: n = a, c, g, or t.

146 <220> FEATURE:
147 <221> NAME/KEY: misc_feature
148 <222> LOCATION: (2710)..(2780)
149 <223> OTHER INFORMATION: L1 fragment insertion

152 <220> FEATURE:
153 <221> NAME/KEY: misc_feature
154 <222> LOCATION: (3391)..(3393)
155 <223> OTHER INFORMATION: splice acceptor site

158 <220> FEATURE:
159 <221> NAME/KEY: misc_feature
160 <222> LOCATION: (3394)..(3577)
161 <223> OTHER INFORMATION: Exon 2

164 <220> FEATURE:
165 <221> NAME/KEY: misc_feature
166 <222> LOCATION: (3406)..(3408)
167 <223> OTHER INFORMATION: Translation initiation codon (ATG)

170 <220> FEATURE:
171 <221> NAME/KEY: misc_feature
172 <222> LOCATION: (3578)..(3583)
173 <223> OTHER INFORMATION: splice donor site

176 <220> FEATURE:
177 <221> NAME/KEY: misc_feature
178 <222> LOCATION: (4123)..(4123)
179 <223> OTHER INFORMATION: n = a, c, g, or t.

182 <220> FEATURE:
183 <221> NAME/KEY: misc_feature
184 <222> LOCATION: (4161)..(4163)
185 <223> OTHER INFORMATION: splice acceptor site

188 <220> FEATURE:
189 <221> NAME/KEY: misc_feature
190 <222> LOCATION: (4164)..(4379)
191 <223> OTHER INFORMATION: Exon 3

194 <220> FEATURE:
195 <221> NAME/KEY: misc_feature
196 <222> LOCATION: (4274)..(4276)
197 <223> OTHER INFORMATION: Translation termination codon (TAA)

200 <220> FEATURE:
201 <221> NAME/KEY: polyA_signal
202 <222> LOCATION: (4374)..(4379)
203 <223> OTHER INFORMATION:

206 <400> SEQUENCE: 1 

207 acaccatttg tcttcatgta accccattag ctataccctc tagtgcaagg aaaccatagg 60
W--> 209 gcctaggtca caccatgagg ctgcncttac aagttatgca aaaactatgg acttgggaga 120
W--> 211 cctgtgcgta acaacatcac acnccaaatt taaccagctc tccccataac agcacgctca 180
213 tgtgttactg aggaaatgcc tgtggattgg agtgtgttct gtgtgcagga ggctggtcca 240
215 ggtttcactt ctgcaggaca ctggacgttt cccaaaacca gcagactttc cccacgtgca 300
217 cacacacccc ttctcatttt gcctctacat ccatatccac tgggcccttc aggcacctac 360
219 taatgcccta gaacctaaaa ccatcatctg gggcccagtt ccctgaatgg ccctaatctc 420
221 ttcctctgct ggaatgagtc cagtgcccac ttcctccaac ggtgaaattg ctgggctgct 480
223 acagatcagg aactcactgc ttcctcatag gggcagccga cttcactgct ctgcaacagc 540
225 gaccacccct agcgaggctt gagatgcctc ttgcctcctt aagactgagg gagacgcttc 600
227 agctctcact ccactgcccc aagtcctcca cagcgcggtg cctgctgcct tcacacagag 660
W--> 229 ctgcaggggn aggtcctgtg tatccggcct gctggaccag cgctgtgcac aaccctccca 720
231 tggcaacagt ggctgcccgg cctgcacact gggcttggca acctcgctgt aggtatttat 780
233 tccctcagga gtgactgcat tcttttccca tttccagaaa actgatgcca tttacctcac 840
235 tatgaggagg aggaggagga ggagggtgga gagtggtaca ttttaaaatg tgcactattc 900
237 tccctaggac tccccctcaa ataacccagg agggaccata ccagctcatt cctgtgtatc 960
W--> 239 ccaagcatan gagtaatcat cccactcatg ctgagtgtat ggtggccatt aagcctgccc 1020
241 tgaactggct ttagaacaag gtgtttgagc acacagcacc gtcttgctgc caccttggcc 1080
W--> 243 ccctcccttg tgagacctct gagacacatt naggtctcac ctaaaaatct caggatttct 1140
W--> 245 aggcccaaan cggtcctaaa aaattgttca gtctgaactc tctaaggtca agagaagagg 1200
247 tggttgctcc ctctaagaaa ccacatgttg catgtacatc cttaattccg gaaagtccaa 1260
249 caaacctgcc ctgcttagca acacaagccg aggtggtact cctctcaccc gggcattctc 1320
251 caacacacct gtttgtccaa acagctttga tttgttttta tagttggacc ccaggttccc 1380
253 aggaggctgg ttcaggccat attccaaatc ctcatctgtg tgtgagtggc attcttagcc 1440
255 tagcctcctt acagggtgga tactatgata cacagccagg ctgtcccagt ggctttcaat 1500
257 attcttttgg tccagatagt tcagcctcag caccagtgta ggcatcacag ggtcaattgt 1560
259 cttaggagtc atggagaatt catagttggt agctacctgg gcctggccag ggctgaccat 1620
261 agacaaggca tccctctgtg aactcctatt ttaatgccag cttcccaaca aatttctcaa 1680
263 ctgctcttac cagcaggtat ttaaactact caatagaaag taaccctgaa aattaggaca 1740
W--> 265 cctgttccca aaagaccctt aaatagggga agtcctttcn ctgcttgtgc acagctgctg 1800
267 atgtggcaac atgaggcctg ggacagggga ctgtcctctg cccactctgg tagcctcacg 1860
269 tagcttaaca atctgtcagt aatacaatac aaaacttaaa ctttcatact gcggttccac 1920
W--> 271 ccaggaagct gtgttcccaa tctgacccgt gattatgggg ccacctcaga gggnacccag 1980
273 tgagggaata ttttgccatc tgggactgtt ggttgctggg ggcagtggct atgagctcag 2040
275 ttaataaact caagcagttt ccttccaaac acacatgtcc tacttaacgt gtccaacaga 2100
W--> 277 gatgatcata ctcatangct gctaaaacat tanttttatt ttgagaaaag tctattcatg 2160
W--> 279 ttcttggccc atggagtttt catttnatta ntttatttat tttgcagaga tggagtctca 2220
281 ctatgttgct caagctggtc tccaactcct gggctcaagc gatcttccta ctttggcctt 2280
283 tgaaagcgct gagattgcct gtgtgagcca tcatgggggc tcactggccc actgattaat 2340
W--> 285 cagattaatt gttttttgct attgaanttg tttgacttcc ttgtatattc ggatatttac 2400
287 ccattctaac acgtagggtt tgcaaatatt ttctctcatg ttctgtgttg ccttttcact 2460
289 cagttgatgg tttcctttgc tgtgcaggtg ctttagtgtt caacgcagcc ccgcttgtct 2520
291 attttccatt ttattgcctg tccctttgat gtcatagcca agaaataatt gcccagatta 2580
293 atgtcaaaaa gctttatccc tatatattct tctagtagtt tatggtttca gatcttatgt 2640
295 ttaggtcttc aatccattga gttgattttt gtatgtggta taagaaaaaa gaccacatgt 2700
297 atacatatct caaattctaa ggtagtatat attagacaca tacaatgtgt ctatttacac 2760
299 acattgagct gaaaataata aacatatttt tatctttcaa tcaactctat ctctatctca 2820
301 ctgaacttgt ttcacctata gcctgatgag gttgctgtcc tctctacccc agctcctata 2880
303 ggagactgct catcccctaa cctcaaaaac cccttcatga gggtgataat gcccttgaat 2940
305 cctgcaatga attagttctc tactacagtg gaattcaggt ctgttatgag ggtctggatc 3000
307 tctgaagaga agagctctca ttttcagaaa ataagcagga tttattccct gaaattactg 3060
309 aattaaatca ctgtttcgat tactttttgc aatattaaaa gtaaatattt aaacaggtaa 3120
311 aaacagaaat aatggtaggg tccttatcat caccgtgaat tccaagctag catagacact 3180
313 aaacctagag attcacacta gaatgaaagc tgggagagca gaggagtctc agaaggatgt 3240

315 ggaggccaat ggacacctgc aacctctcca acgaaatgcc tacctcctct cactgcagca 3300 

317 tccatctctg agccttctcg cagcagagct ataaattcag cctggctcct ccgttcccac 3360 

319 acatccactc ctgctctccc tcctctcctc caggtgacta cagttatgag gaccctcacc 3420 

321 ctcctctctg cctttctcct ggtggccctt caggcctggg cagagccgct ccaggcaaga 3480 

323 gctcatgaga tgccagccca gaagcagcct ccagcagatg accaggatgt ggtcatttac 3540 

325 ttttcaggag atgacagctg ctctcttcag gttccaggtg agagatgcca gcatgcagag 3600 

327 ctacagacta gacagaagga caggagacag gctctggaat tggatctcag tggcagatgt 3660 

329 cacttaggtg gctatactta acatctctgg tcctggattt tctcatatct aaatggaata 3720 

331 gagaaccaaa gaaatctaag agatttttct ttctccaaaa acttgattcc aagatatgac 3780 

333 tgtgaaattc actagattta agatataagg agatgctacc tagttccttc tggagccaga 3840 

335 caaacaagct taagtatata ggaaaatatt tcaccctgtc tatataggag gttttagaac 3900 

337 ctggagagga gcctaagaat gtgttcaggt gtgtgtgtga tgggcaggaa tgcagaaaag 3960 

339 tgaagcaaag gagaatgagt ctcgaatcct gtgtgaccag cactgctctg tgtatttatt 4020 

341 cctattgact gagattgttt gtgctaccgg ctgtaataca gccaacatca ctcatcagcc 4080 

343 aacatgtgac ttctccaaga ttccctttac cacccactgc tgnaccccgt actcagtttc 4140 

345 tgatgctctc tctgggtccc caggctcaac aaagggcttg atctgccatt gcagagtact 4200 

347 atactgcatt tttggagaac atcttggtgg gacctgcttc atccttggtg aacgctaccc 4260 

349 aatctgctgc tactaagctt gcagactaga gaaaaagagt tcataatttt ctttgagcat 4320 

351 taaagggaat tgttattctt ataccttgtc ctcgatttcc tgtcctcatc ccaaataaat 4380 

353 acttggtaac atgatttccg ggtttttttt ttttt 4415 
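
The feature coordinates for SEQ ID NO: 1 can be cross-checked against the CDS length that the listing gives for SEQ ID NO: 2. The sketch below assumes, based only on the annotations above, that the coding region runs from the ATG in Exon 2 to the end of Exon 2 and then from the start of Exon 3 through the TAA codon, with Exon 1 untranslated. The function and constant names are illustrative and are not part of the listing.

```python
# Coordinates are 1-based and inclusive, as in the <222> LOCATION fields.
EXON_2 = (3394, 3577)
EXON_3 = (4164, 4379)
START_CODON = (3406, 3408)  # Translation initiation codon (ATG)
STOP_CODON = (4274, 4276)   # Translation termination codon (TAA)
CDS_IN_CDNA = (52, 336)     # CDS location reported for SEQ ID NO: 2


def span_length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def coding_length_from_genomic():
    """Coding length implied by the genomic annotations (assumed two-exon CDS)."""
    in_exon_2 = span_length(START_CODON[0], EXON_2[1])  # ATG to end of Exon 2
    in_exon_3 = span_length(EXON_3[0], STOP_CODON[1])   # start of Exon 3 to TAA
    return in_exon_2 + in_exon_3


genomic_length = coding_length_from_genomic()
cdna_length = span_length(*CDS_IN_CDNA)
print(genomic_length, cdna_length, genomic_length == cdna_length)
```

Under these assumptions both routes give the same coding length, and that length is a multiple of three, which is consistent with an open reading frame that ends on the annotated stop codon.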

356 <210> SEQ ID NO: 2
357 <211> LENGTH: 453
358 <212> TYPE: DNA
359 <213> ORGANISM: Homo sapiens

361 <220> FEATURE:
362 <221> NAME/KEY: CDS
363 <222> LOCATION: (52)..(336)
364 <223> OTHER INFORMATION: Def-X coding sequence

367 <400> SEQUENCE: 2

368 ctctgcccac tctggtagcc tcacgtagct taacaatctg tgactacagt t atg agg  57
369                                                          Met Arg
370                                                          1

372 acc ctc acc ctc ctc tct gcc ttt ctc ctg gtg gcc ctt cag gcc tgg  105
373 Thr Leu Thr Leu Leu Ser Ala Phe Leu Leu Val Ala Leu Gln Ala Trp
374         5                   10                  15

376 gca gag ccg ctc cag gca aga gct cat gag atg cca gcc cag aag cag  153
377 Ala Glu Pro Leu Gln Ala Arg Ala His Glu Met Pro Ala Gln Lys Gln
378     20                  25                  30

380 cct cca gca gat gac cag gat gtg gtc att tac ttt tca gga gat gac  201
381 Pro Pro Ala Asp Asp Gln Asp Val Val Ile Tyr Phe Ser Gly Asp Asp
382 35                  40                  45                  50

384 agc tgc tct ctt cag gtt cca ggc tca aca aag ggc ttg atc tgc cat  249
385 Ser Cys Ser Leu Gln Val Pro Gly Ser Thr Lys Gly Leu Ile Cys His
386                 55                  60                  65

388 tgc aga gta cta tac tgc att ttt gga gaa cat ctt ggt ggg acc tgc  297
389 Cys Arg Val Leu Tyr Cys Ile Phe Gly Glu His Leu Gly Gly Thr Cys
390             70                  75                  80

392 ttc atc ctt ggt gaa cgc tac cca atc tgc tgc tac taa gcttgcagac  346
393 Phe Ile Leu Gly Glu Arg Tyr Pro Ile Cys Cys Tyr
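
The codon-to-residue pairing in the CDS of SEQ ID NO: 2 can be checked by translating the codons with the standard genetic code. The following is a minimal sketch: the codon string is transcribed from positions 52 to 336 as printed in this listing, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative helpers, not part of PatentIn or any USPTO tool.

```python
from itertools import product

# Standard genetic code with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# Codons of the Def-X CDS, transcribed from the listing (positions 52..336).
CDS = """
atg agg
acc ctc acc ctc ctc tct gcc ttt ctc ctg gtg gcc ctt cag gcc tgg
gca gag ccg ctc cag gca aga gct cat gag atg cca gcc cag aag cag
cct cca gca gat gac cag gat gtg gtc att tac ttt tca gga gat gac
agc tgc tct ctt cag gtt cca ggc tca aca aag ggc ttg atc tgc cat
tgc aga gta cta tac tgc att ttt gga gaa cat ctt ggt ggg acc tgc
ttc atc ctt ggt gaa cgc tac cca atc tgc tgc tac taa
"""


def translate(cds):
    """Translate whitespace-separated codons, stopping at the first stop codon."""
    protein = []
    for codon in cds.split():
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(THREE_LETTER[residue])
    return protein


protein = translate(CDS)
print(len(protein), " ".join(protein[:5]), "...", protein[-1])
```

Comparing the returned residues line by line with the three-letter rows of the listing is a quick way to spot transcription errors in either the codons or the residues.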
RAW SEQUENCE LISTING ERROR SUMMARY
PATENT APPLICATION: US/10/045,180A

DATE: 06/03/2002
TIME: 12:40:20






Please Note: 

Use of n and/or Xaa has been detected in the Sequence Listing. Please review the
Sequence Listing to ensure that a corresponding explanation is presented in the <220> 
to <223> fields of each sequence which presents at least one n or Xaa. 

Seq#:1; N Pos. 85,143,670,970,1111,1150,1780,1974,2117,2133,2186,2191,2367
Seq#:1; N Pos. 4123
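
The review requested in the note above amounts to confirming that every n in a sequence is covered by a single-position feature. A minimal sketch of that check follows, using only the first 180 bases of SEQ ID NO: 1 as printed in the listing. The helper names are hypothetical, and the annotated set is simply the position list reported in this summary.

```python
# First three rows (positions 1..180) of SEQ ID NO: 1, as printed in the listing.
SEQ_1_HEAD = """
acaccatttg tcttcatgta accccattag ctataccctc tagtgcaagg aaaccatagg
gcctaggtca caccatgagg ctgcncttac aagttatgca aaaactatgg acttgggaga
cctgtgcgta acaacatcac acnccaaatt taaccagctc tccccataac agcacgctca
"""

# Positions that carry an "n = a, c, g, or t" feature in the listing.
ANNOTATED_N = {85, 143, 670, 970, 1111, 1150, 1780, 1974,
               2117, 2133, 2186, 2191, 2367, 4123}


def find_n_positions(sequence_text):
    """Return the 1-based positions of every 'n' after removing whitespace."""
    bases = "".join(sequence_text.split())
    return [i + 1 for i, base in enumerate(bases) if base == "n"]


def unannotated(positions, annotated):
    """Return the positions that have no matching feature annotation."""
    return [p for p in positions if p not in annotated]


found = find_n_positions(SEQ_1_HEAD)
print(found, unannotated(found, ANNOTATED_N))
```

Running the same check over the full 4415-base sequence would need the complete text of SEQ ID NO: 1. The excerpt here only exercises the first two reported positions.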
VERIFICATION SUMMARY
PATENT APPLICATION: US/10/045,180A

DATE: 06/03/2002
TIME: 12:40:20






L:209 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:60
L:211 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:120
L:229 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:660
L:239 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:960
L:243 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:1080
L:245 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:1140
L:265 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:1740
L:271 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:1920
L:277 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:2100
L:279 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:2160
L:285 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:2340
L:343 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:1 after pos.:4080
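
The "after pos." values in this summary appear to be the last position printed before the 60-base row that contains an n. That reading is an inference from the numbers in this report, not a documented rule of the verification software. A short sketch under that assumption, with hypothetical helper names:

```python
# N positions reported for SEQ ID NO: 1 in the error summary.
N_POSITIONS = [85, 143, 670, 970, 1111, 1150, 1780, 1974,
               2117, 2133, 2186, 2191, 2367, 4123]

ROW_WIDTH = 60  # bases printed per row in the listing


def row_offset(position, width=ROW_WIDTH):
    """Last position printed before the row that holds `position` (1-based)."""
    return (position - 1) // width * width


def flagged_rows(positions, width=ROW_WIDTH):
    """Distinct row offsets, in ascending order, for rows containing an n."""
    return sorted({row_offset(p, width) for p in positions})


print(flagged_rows(N_POSITIONS))
```

Positions that share a row, such as 2117 and 2133, collapse into a single offset, which would explain why fourteen n positions produce twelve warnings.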